Skip to content

Conversation

AlexeySachkov
Copy link
Contributor

@AlexeySachkov AlexeySachkov commented Sep 16, 2025

The sycl_ext_oneapi_bindless_images extension specification documents multiple overloads for copying between bindless images depending on source and destination types (i.e. image mem handle to USM, USM to USM, etc.).

At Unified Runtime level, there is a single function urBindlessImagesImageCopyExp which implements all kinds of copies. However, at lower levels such as CUDA and Level Zero there are several different APIs for different kinds of copies. To understand which one to call, a "copy direction" argument is passed.

However, that "copy direction" information is useless for the L0 path, because L0 API is not based on a direction, but it is based on input types (mem handle vs USM). What we have today is tailored to CUDA and if we change copy directions to be tailored to L0, then CUDA breaks (because input types also matter there, but to lesser extent).

In order to fix all issues with bindless image copies and make them work across all adapters, a new argument is introduced into urBindlessImagesImageCopyExp to specify types of source and destination arguments. L0 UR adapter makes a decision about which API to call solely on this information, ignoring "copy direction".

CUDA & HIP adapters are left as-is - they still use "copy direction" information and sometimes try to guess the type of input arguments to call the right API. With the new argument introduced by this patch, those adapters could be refactored to reduce amount of guessing, but that is outside of the scope of this PR.

@AlexeySachkov AlexeySachkov changed the title [SYCL] Fix and re-enable bindless image copy tests on L0 [SYCL] Fix and enable bindless image copy tests on L0 Sep 16, 2025
@AlexeySachkov AlexeySachkov marked this pull request as ready for review September 16, 2025 13:58
@AlexeySachkov AlexeySachkov requested review from a team as code owners September 16, 2025 13:58
Copy link
Contributor

@againull againull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SYCL RT changes LGTM.
NIT: It would be nice to have unit tests to check that we pass proper input_types depending on ext_oneapi_copy usage, but since there are e2e tests and we don't have existing unit tests for this functionality, it seems excessive for this pr.

- name: MEM_TO_MEM
desc: "Memory to Memory"
- name: IMAGE_TO_IMAGE
desc: "Image handle to image handle"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though I don't mind having a new enum, I do wonder if it would make sense to just fold it into ur__exp_image_copy_flags_t. You could effectively do it in 4 bits:

Bit 0 => 0 if source is host, 1 if source is device
Bit 1 => 0 if destination is host, 1 if destination is device
Bit 2 => 0 if input is mem, 1 if input is image
Bit 3 => 0 if output is mem, 1 if output is image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is indeed possible and that is exactly kind of feedback/direction/preference I'm looking for from @intel/unified-runtime-reviewers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants